Here's one consideration: pow() functions usually don't do repeated multiplications. They are implemented, roughly, as:
Code:
double pow( double x, double y )
{
    return exp( y * log(x) );
}
This way, no matter the base or the exponent, it takes more or less the same number of cycles (especially if you are dealing with x87 code). Here's a simple test for x86-64:
Code:
#include <stdint.h>
#include <inttypes.h>
#include <stdio.h>
#include <math.h>
static uint64_t local_tsc__;
#define BEGIN_TSC(...) do { \
    uint32_t a, d; \
    \
    __asm__ __volatile__ ( \
        "xorl %%eax,%%eax\n\t" \
        "cpuid\n\t" \
        "rdtsc" \
        : "=a" (a), "=d" (d) :: "%rbx", "%rcx" \
    ); \
    \
    local_tsc__ = ((uint64_t)d << 32) | a; \
} while (0)
// rdtscp is available on all Haswell architectures.
#define END_TSC(c) do { \
    uint32_t a, d; \
    \
    __asm__ __volatile__ ( \
        "rdtscp" \
        : "=a" (a), "=d" (d) :: "%rcx" \
    ); \
    \
    (c) = (((uint64_t)d << 32) | a) - local_tsc__; \
} while (0)
int main ( void )
{
    int x, y;
    double z;
    uint64_t c;

    /* Bail out if the input is not two integers. */
    if ( scanf ( "%d %d", &x, &y ) != 2 )
        return 1;

    BEGIN_TSC();
    z = pow ( x, y );
    END_TSC ( c );

    printf ( "x^y = %g (%" PRIu64 " cycles)\n",
             z, c );
    return 0;
}
Compiling and linking (and running):
Code:
$ cc -O2 -o test test.c -lm
$ ./test <<< '50 10'
x^y = 9.76562e+16 (73704 cycles)
$ ./test <<< '500 100'
x^y = 7.88861e+269 (79402 cycles)
$ ./test <<< '5 300'
x^y = 4.90909e+209 (62106 cycles)
$ ./test <<< '5000 1000'
x^y = inf (59896 cycles)
The variations in cycle counts are due to cache effects, interrupts, page faults and other issues. Notice that I am calculating with exponents 10, 100, 300 and 1000 and getting more or less the same calculation time...
So, calling pow() n times will cost you roughly n times the cycles of a single call...
PS: I know the formula is not correct as written for x < 0 (log(x) is undefined there, and pow() only has a real result when y is an integer)... real pow() implementations make other tests to handle those cases.